Dave Raggett raised some interesting issues in his message. In
particular:

> Caching
> -------
>
> It will be desirable to avoid overloading servers with popular documents by
> supporting a caching scheme at local servers (or even at browsers?). This
As well as caching, replication would be nice. But this is
only practical if resource identifiers do not contain location
information (otherwise replication is only possible by making
all the peer servers appear to be one machine, as in the
DNS CNAME suggestion I made some time ago).
But if resource identifiers do not contain host information
then you need an external means of determining how to reach
the resource. This is analogous to routing protocols (an address
is not a route ...).
Such a system is probably over-ambitious for now. Anyway,
back to caching ...

> Servers need to be able to work out what documents to trash from
> their caches.
> A simple approach is to compare the date the document was received with the
> date it was originally created or last modified. Say it works out that when
> you got the document it was already one week old.
> Then one rough rule of thumb
> is to trash it after another week. You can be rather smarter if there is a
> valid expiry date included with the document:

I think this is silly. I haven't changed a document for
six months, therefore it is safe to say that it won't be
changed for the next six months ...
This also depends on hosts agreeing on the date. To quote
RFC 1128, talking about a 1988 survey of the time/date on
Internet hosts, "... a few had errors as much as two years".
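
For concreteness, here is a minimal sketch (in Python; the names and
the "keep it as long again" factor are mine) of the heuristic being
discussed, inheriting all of the clock-skew caveats above:

    from datetime import datetime
    from typing import Optional

    def heuristic_expiry(received: datetime, last_modified: datetime,
                         expires: Optional[datetime] = None) -> datetime:
        # If the server supplied a valid expiry date, trust it.
        if expires is not None:
            return expires
        # Otherwise assume the document will stay unchanged for roughly as
        # long as it had already been unchanged when we fetched it
        # (one week old when received => trash it after another week).
        age_when_fetched = received - last_modified
        return received + age_when_fetched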

> I think that we need to provide an operation in which the server returns a
> document only if it is later than a date/time supplied with the request.

This would be useful as part of a replication system,
as long as both ends exchanged timestamps initially so
that the dates can be synchronised.
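
As a rough sketch (Python again; the status values and parameter names
are illustrative, not anything agreed in this thread), the server side
of such an operation might look like:

    from datetime import datetime
    from typing import Optional, Tuple

    def conditional_fetch(doc_modified: datetime, doc_body: bytes,
                          if_modified_since: Optional[datetime]
                          ) -> Tuple[str, Optional[bytes]]:
        # Send the document only if it has changed since the date/time
        # supplied with the request; otherwise a cheap "not modified"
        # reply lets the requester keep its cached or replicated copy.
        if if_modified_since is not None and doc_modified <= if_modified_since:
            return "not modified", None
        return "document follows", doc_body

Of course the comparison is only meaningful if the two clocks roughly
agree, hence the timestamp exchange suggested above.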

> Note that servers shouldn't cache documents with restricted readership, since
> each server doesn't know the restrictions to apply. This requires a further
> header to identify such documents as being unsuitable for general caching:

and also ...

> What happens if a copyright protected document is saved in the cache of a
> local server? We have got to ensure that the rightful owners get paid for
> access even when the document is obtained from a local server's cache.

It may be stating the obvious, but once you allow a
user to access your data such that they can save it, there is
no technical way you can prevent them from publicly
redistributing your data. This is a social/legal problem,
not a technical one.
Accepting that nothing can be done to stop deliberate
abuse of licensed information, there is a need to prevent
accidental abuse. Probably the simplest way to do this is
to mark the document as one which should NOT be cached.
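
Something as simple as this would do (the header names are invented
for illustration; the point is only that the owner marks the document
and a caching server refuses to keep anything so marked):

    def may_cache(headers: dict) -> bool:
        # "No-Cache" and "Readership" are hypothetical header names.
        if headers.get("No-Cache", "").lower() == "true":
            return False
        if headers.get("Readership", "world").lower() != "world":
            return False
        return True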

Perhaps this is leading towards a very simple-minded
caching scheme a la DNS, where information is returned
together with an indication of its "time to live" (TTL),
i.e. how long it can reasonably be cached. Setting a default
TTL for a server gives an idea of the "volatility" of the
information contained therein.
Unless a document is exported with world read access,
it should always have a TTL of 0.
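
Sketched as code (the interface and the default TTL are assumptions,
not a proposal anyone has agreed to), such a cache might look like:

    import time
    from typing import Dict, Optional, Tuple

    class TTLCache:
        """Simple-minded cache a la DNS: each entry carries a time to
        live supplied by the origin server. A TTL of 0 means "never
        cache", which is what a document without world read access
        should always say."""

        def __init__(self, default_ttl: float = 3600.0):
            self.default_ttl = default_ttl   # server-wide "volatility" hint
            self.entries: Dict[str, Tuple[float, bytes]] = {}

        def store(self, name: str, body: bytes,
                  ttl: Optional[float] = None) -> None:
            ttl = self.default_ttl if ttl is None else ttl
            if ttl <= 0:
                return                       # restricted document: never cached
            self.entries[name] = (time.time() + ttl, body)

        def lookup(self, name: str) -> Optional[bytes]:
            entry = self.entries.get(name)
            if entry is None:
                return None
            expires, body = entry
            if time.time() > expires:
                del self.entries[name]       # past its TTL: trash it
                return None
            return body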


Kevin Hoadley, Rutherford Appleton Laboratory, khoadley@directory.rl.ac.uk